Increased Diphone Recognition for an Afrikaans TTS system

Authors

  • Francois Rousseau
  • Daniel Mashao
Abstract

This paper discusses the implementation of a diphone-based Afrikaans TTS system. Using diphones makes the system flexible but presents other challenges. A previous Afrikaans TTS system, built by SUN, was based on full words. A full-word TTS system produces more natural-sounding speech than systems designed using other techniques, but it lacks flexibility. The baseline system was built using the Festival Speech Synthesis System. Problems occurred in the baseline because of mislabeled diphones and because of the diphone index. The system was improved by manually labeling the diphones using Wavesurfer and by changing the diphone index. Wavelength comparison tests were done on the diphone index to show how much of the diphones is recognized during synthesis. For the diphones tested, results show an average improvement of 38% in diphone recognition compared to the baseline. These changes improve the overall quality of the system.
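As a rough illustration of what a diphone index does, the sketch below splits a phone sequence into diphones and checks which of them appear in an index. It is not the authors' method and does not use Festival's API: the function names, the `pau` silence label, the `a-b` naming scheme, and the toy index are all assumptions made for this example, and the coverage figure it computes is a simplified stand-in for the comparison tests mentioned in the abstract.

```python
def to_diphones(phones, silence="pau"):
    """Return the diphones (adjacent phone pairs) for a phone sequence.

    The sequence is padded with a silence phone on both sides, so the
    first and last phones also get a transition unit. The "a-b" naming
    and the "pau" label are assumptions for this sketch.
    """
    padded = [silence, *phones, silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def index_coverage(phones, index):
    """Fraction of the required diphones that are present in the index."""
    units = to_diphones(phones)
    found = [u for u in units if u in index]
    return len(found) / len(units)


# Hypothetical toy index: the set of diphone names that have recordings.
toy_index = {"pau-k", "k-a", "a-t", "t-pau"}

print(to_diphones(["k", "a", "t"]))
print(index_coverage(["k", "a", "t", "s"], toy_index))
```

In this toy case the second call needs the units `t-s` and `s-pau`, which are missing from the index, so only part of the utterance could be synthesized from stored units. A mislabeled or incomplete index lowers output quality in a similar way.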


Similar resources

Rapid development of an Afrikaans-English speech-to-speech translator

In this paper we investigate the rapid deployment of a two-way Afrikaans to English Speech-to-Speech Translation system. We discuss the approaches and amount of work involved to port a system to a new language pair, i.e. the steps required to rapidly adapt ASR, MT and TTS components to Afrikaans under limited time and data constraints. The resulting system represents the first prototype built for...


Speech Data Analysis for Diphone Construction of a Maori Online Text-to-speech Synthesizer

One of the main types of speech processing technologies today is text-to-speech (TTS) synthesis. A well-established speech synthesis technique called ‘diphone concatenation’ uses a speaker's processed speech examples to give the TTS synthesis system a more human-like response. This methodology has been used to construct many diphone databases for various languages, and was the basis for bu...


A Diphone Sharing Method Towards Scalable Unit-training-based TTS

One of the most popular applications of Text to Speech (TTS) is in embedded devices. The resource limitations of embedded devices require the footprint of a TTS system to be very small. Toshiba TTS for embedded devices is a unit-training-based system and uses the diphone as its basic unit. The trained diphone inventory occupies a large part of the footprint. This paper proposes a diphone sharing method to ...


Implementation and evaluation of a text-to-speech synthesis system for Turkish

In this paper, a diphone-based Text-to-Speech (TTS) system for the Turkish language is presented. Turkish is the official language of Turkey, where it is the native language of 70 million people, and it is also widely spoken in Asia (Azerbaijan, Uzbekistan, Kazakhstan, Kyrgyzstan and Iran), Cyprus and the Balkans. The research has been done through a visiting internship at CSLR (the Center fo...


Extraction of Di-phones for Telugu: Issues and solutions

This paper describes a method for extracting diphones to generate a diphone database for concatenative text-to-speech systems. A diphone is an adjacent pair of phones. Diphones are a very important resource for both text to speech [TTS] and speech to text [STT]. Consider the pronunciation of -kaaki. It consists of phonemes [k], అ [a], అ [a], [k], ఇ[i]. The diphones generated while pronouncing the ...




Publication date: 2004